Noisy Laplace deconvolution with error in the 
operator 



Thomas Vareschi 

Université Denis Diderot Paris 7, Bâtiment Sophie-Germain, rue Alice-Domon et
Léonie-Duquet, 75013 Paris, France

E-mail: thomas.vareschi@univ-paris-diderot.fr

Abstract. We address the problem of Laplace deconvolution with random noise in
a regression framework. The time set is not considered to be fixed, but grows with
the number of observation points. Moreover, the convolution kernel is unknown, and
accessible only through experimental noise. We make use of a recent estimation
procedure based on a Galerkin projection of the operator on Laguerre functions ([9]),
and couple it with a threshold performed both on the operator and the observed signal.
We establish the minimax optimality of our procedure under the squared loss error,
when the smoothness of the signal is measured in a Laguerre-Sobolev sense and the
kernel satisfies fair blurring assumptions. It is important to stress that the resulting
process is adaptive with regard both to the target function's smoothness and to the
kernel's blurring properties. We end this paper with a numerical study emphasizing
the good practical performances of the procedure on concrete examples.

1. Introduction



Laplace deconvolution is motivated by a wide set of practical applications, ranging 
from population dynamics or physics to computational tomography or fluorescence 
spectroscopy (Linz [22, Chap. 2], Ameloot et al. [4], Comte et al. [9]). In the 
corresponding setting we observe $q$, the result of the action of a kernel $g$ on the
function of interest $f$, according to the following equation

$$q(t) = \int_0^t g(t-\tau) f(\tau)\, d\tau, \qquad t \ge 0 \qquad (1)$$

Equation (1) is also referred to as a Volterra integral equation. One of its main features
is its causal property, since $q(t)$ is affected only by the values of $f$ and $g$ at times
anterior to $t$. Of course, only finite samples of $q(t)$ are accessible in practice.
Moreover, the presence of additional noise justifies the empirical modeling of (1) by the
classical regression model, inspired by Abramovich et al. [3]

$$y(t_i) = q(t_i) + \sigma \eta_i, \qquad i = 1, \ldots, n \qquad (2)$$

where $0 \le t_1 < \ldots \le t_n \le T_n$ are the points of observation, $(\eta_i)_{i=1,\ldots,n}$ are
independent standard gaussian variables, and $\sigma$ is a fixed factor accounting for the
precision of the observations. $T_n$ is supposed to grow with the number of observations $n$.
As pointed out in Abramovich et al. [3] and Comte et al. [9], in spite of its apparent
similarity with the Fourier deconvolution problem, the theoretical features of equation
(1), as well as the practical problems raised during its resolution, are deeply different.
More precisely, setting artificially $g(t) = f(t) = 0$ for $t < 0$ amounts to solving the
classical Fourier deconvolution problem.

A first notable objection is that the framework of classical Fourier deconvolution assumes
periodicity of the function $f$ and the kernel $g$ on $[0, T]$, a meaningless notion when
applied to a varying time set $[0, T_n]$. Even more problematic is the fact that this
modeling totally ignores the causal feature of Laplace convolution, creating unwanted
interferences between different time sets. Finally, artificially extending $q$ and $g$
to $t < 0$ creates artifacts on the estimated function at times $t < 0$ as well.

Another approach is to treat equation (2) as a general ill-posed problem and apply a 
Tikhonov regularization (Golubev [15]). However the direct implementation of this 
method also destroys the causal nature of equation (1), and tends to oversmooth 
the solution (Cinzori and Lamm [7]). Subsequent adaptations which remedy these 
shortcomings are present in Lamm [20] and Cinzori and Lamm [7]. However in these 
works the time set is considered to be fixed. 

A more suitable theoretical tool for solving (1) is the Laplace transform, which
allows one to derive a closed form of the solution. However, its direct implementation is
compromised by numerical problems, since the generic expression of the inverse Laplace
transform is not easily computable in general. This motivates the widespread use of
inversion tables, unfortunately irrelevant when the image function is not known exactly
but approximated via a numerical scheme.

In this paper, following Comte et al. [9], we will exploit the properties of Laguerre 
functions, which can be used either to compute the inverse Laplace transform (Abate 
et al. [1], Lien et al. [21]), or to solve directly equation (1) (Keilson and Nunn [19]). More 
precisely, a Galerkin method applied to (2) shows that, even if their role is not entirely 
symmetric to the role played by harmonics in the framework of Fourier deconvolution, 
they allow a sparse analysis of equation (1). 

All the previously mentioned works only concerned the case of a deterministic noise
at best. The presence of random noise requires an additional treatment, and calls
for specific statistical tools. In the setting of random noise, Dey et al. [11] considered a
kernel of the form $e^{-at}$ and used a regularized inversion of the Laplace transform.
More recently, Abramovich et al. [3] conceived an optimal procedure in the minimax
sense on Hölder spaces $H^s(\mathbb{R}_+)$. This procedure used an exact expression of the solution
involving the derivatives of $q$, which were then estimated via Lepskii's method. However,
a shortcoming of the procedure is its strong dependence on the kernel $g$, in the sense that
a small error in $g$ can translate into a wide difference in the result. In other words there
seems to be a trade-off between the closed form of the solution and the instability with
regard to the kernel. Moreover, the fact that $g$ is seldom observed directly in practice,
but is usually subject to experimental noise, should prompt us to favor stability over
exactness.

In that spirit, Comte et al. [9] took advantage of the algebraic properties of Laguerre
functions in the context of (2). With an adequate penalty term, they proposed an
estimator which mimics the oracle risk to within logarithmic terms. This modeling
has the non negligible advantage of practical simplicity and efficiency, since solving
equation (1) amounts to the inversion of a lower triangular Toeplitz matrix.
Even if this latter procedure proves experimentally to be more stable with regard to $g$,
no systematic study has been conducted on the subject yet. In this paper we attempt to
fill this gap: we suppose that the observation of $g$ is contaminated by a gaussian white
noise, and show how Laguerre functions allow us to handle this issue. We adopt the
minimax point of view and suppose that $f$ belongs to a Laguerre-Sobolev
space and that $g$ satisfies standard blurring assumptions. We apply recent techniques
for the treatment of noisy operators in the context of inverse problems (Hoffmann and
Reiß [17], Delattre et al. [10]), which consist in a preliminary processing of the operator
$K$ coupled with a classical thresholding procedure applied to $y$.

2. Discretization of Laplace deconvolution 

2.1. Laguerre functions 

Suppose that the target function $f$ and the kernel $g$ both lie in $L^2(\mathbb{R}_+)$. Define the
Laguerre polynomials (see Gradshteyn and Ryzhik [16])

$$L_k(t) = \sum_{j=0}^{k} (-1)^j \binom{k}{j} \frac{t^j}{j!}, \qquad k \ge 0 \qquad (4)$$

and, following Comte et al. [9], the ensuing Laguerre functions, depending on the
parameter $a > 0$,

$$\varphi_k(t) = \sqrt{2a}\, e^{-at} L_k(2at), \qquad t \ge 0 \qquad (5)$$

The parameter $a$ is a tuning parameter used to fit experimental curves. The Laguerre
functions constitute a Hilbert basis of $L^2(\mathbb{R}_+)$. Any function $f \in L^2(\mathbb{R}_+)$ satisfies

$$f = \sum_{k \ge 0} f_k \varphi_k, \qquad f_k = \int_0^{\infty} f(\tau) \varphi_k(\tau)\, d\tau \qquad (6)$$

The following proposition illustrates the convenience of Laguerre functions in the
framework of equation (1).

Proposition 2.1 (Gradshteyn and Ryzhik [16], Formula 7.411.4).

$$\forall a > 0,\ \forall t > 0, \qquad \int_0^t \varphi_k(x) \varphi_m(t-x)\, dx = (2a)^{-1/2} \left( \varphi_{k+m}(t) - \varphi_{k+m+1}(t) \right) \qquad (7)$$

From now on, unless explicitly mentioned, we will suppose $a = 1/2$.
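
The convolution identity of Proposition 2.1 can be checked numerically. The sketch below is only an illustration: it assumes the usual definition of the Laguerre functions, $\varphi_k(t) = \sqrt{2a}\, e^{-at} L_k(2at)$, and the helper names (`laguerre_poly`, `phi`, `convolution`, `closed_form`) are hypothetical, not taken from the paper.

```python
import math


def laguerre_poly(k, x):
    """Laguerre polynomial L_k(x), computed with the three-term recurrence."""
    prev, cur = 1.0, 1.0 - x
    if k == 0:
        return prev
    for j in range(1, k):
        # (j + 1) L_{j+1}(x) = (2j + 1 - x) L_j(x) - j L_{j-1}(x)
        prev, cur = cur, ((2 * j + 1 - x) * cur - j * prev) / (j + 1)
    return cur


def phi(k, t, a=0.5):
    """Laguerre function, assumed to be sqrt(2a) * exp(-a t) * L_k(2 a t)."""
    return math.sqrt(2 * a) * math.exp(-a * t) * laguerre_poly(k, 2 * a * t)


def convolution(k, m, t, a=0.5, steps=2000):
    """Composite Simpson approximation of int_0^t phi_k(x) phi_m(t - x) dx."""
    h = t / steps  # steps must be even for Simpson's rule
    total = phi(k, 0.0, a) * phi(m, t, a) + phi(k, t, a) * phi(m, 0.0, a)
    for i in range(1, steps):
        x = i * h
        weight = 4 if i % 2 else 2
        total += weight * phi(k, x, a) * phi(m, t - x, a)
    return total * h / 3


def closed_form(k, m, t, a=0.5):
    """Right-hand side of identity (7)."""
    return (phi(k + m, t, a) - phi(k + m + 1, t, a)) / math.sqrt(2 * a)


if __name__ == "__main__":
    for a in (0.5, 1.0):
        gap = abs(convolution(2, 3, 1.7, a) - closed_form(2, 3, 1.7, a))
        print(f"a = {a}: |numerical - closed form| = {gap:.2e}")
```

For $k = m = 0$ and $a = 1/2$ both sides reduce to $t e^{-t/2}$, which gives a quick sanity check by hand.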
2.2. Galerkin method 

Proposition 2.1 prompted Comte et al. [9] to apply a Galerkin scheme to equation
(1). Galerkin schemes rely on the choice of a set of functions which discretize the
inverse problem at stake in a convenient way. They were beneficially applied in the
context of inverse problems (Cohen et al. [8]), and blind deconvolution (Efromovich
and Koltchinskii [14], Hoffmann and Reiß [17] and Delattre et al. [10]). We briefly
recall the underlying methodology of a Galerkin scheme and show how it
conveniently applies to equation (1).

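Let $f \in L^2(\mathbb{R}_+)$ and $K$ an operator of $L^2(\mathbb{R}_+)$, and suppose we want to recover $f$
from the observation $q = Kf$. Note $V_\ell$ the finite dimensional space spanned by the
orthogonal set of Laguerre functions $\{\varphi_k\}_{k \le \ell}$. The Galerkin approximation $f^\ell$ of $f$ on
$V_\ell$ is the solution of the equation

$$\langle K f^\ell, v \rangle = \langle q, v \rangle \quad \forall v \in V_\ell \iff \sum_{i \le \ell} \langle K\varphi_i, \varphi_k \rangle \langle f^\ell, \varphi_i \rangle = \langle q, \varphi_k \rangle \quad \forall k \le \ell \qquad (8)$$

We shall note $K^\ell$ the Galerkin matrix $(K^\ell)_{i,j} = \langle K\varphi_j, \varphi_i \rangle$, $i, j \le \ell$. Note hence $K$ the
operator of $L^2(\mathbb{R}_+)$ mapping $f$ onto $t \mapsto \int_0^t f(t-\tau) g(\tau)\, d\tau$. We can reformulate (1) as

$$q^\ell = K^\ell f^\ell \qquad (9)$$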

Moreover, Proposition 2.1 implies: 

Proposition 2.2 (Comte et al. [9], Lemma 1). The Galerkin matrix $K^\ell$ is lower
triangular, Toeplitz. More precisely, note $\tilde g$ the function with Laguerre coefficients

$$\tilde g_0 = g_0, \qquad \tilde g_k = g_k - g_{k-1}, \quad k \ge 1$$

Then

$$K^\ell = \begin{pmatrix} \tilde g_0 & 0 & \cdots & 0 \\ \tilde g_1 & \tilde g_0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ \tilde g_\ell & \cdots & \tilde g_1 & \tilde g_0 \end{pmatrix}$$

In the sequel, for any function $f \in L^2(\mathbb{R}_+)$, we will note $T(f)$ the infinite Toeplitz
matrix such that $T(f)_{i,j} = f_{i-j}$ for all $i, j \ge 0$ (with $f_k = 0$ for $k < 0$), and $T_\ell(f)$ the
extracted matrix defined by $T_\ell(f)_{i,j} = T(f)_{i,j}$, $i, j < \ell + 1$. In particular,

$$K^\ell = T_\ell(\tilde g)$$

The resolution of the linear system (8) is now very convenient in practice, provided
that $K^\ell$ is invertible. This is equivalent to $\tilde g_0 \ne 0$, an assumption we will make in the
sequel.
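
Since the Galerkin matrix is lower triangular and Toeplitz, system (8) can be solved by plain forward substitution in $O(\ell^2)$ operations. The following sketch illustrates this; it assumes that the Toeplitz coefficients are $\tilde g_k = (2a)^{-1/2}(g_k - g_{k-1})$ (with $g_{-1} = 0$), which is what Proposition 2.1 yields, and the function names are hypothetical.

```python
def toeplitz_coefficients(g, a=0.5):
    """Coefficients g~_k = (2a)^(-1/2) * (g_k - g_{k-1}), with g_{-1} = 0 (assumed form)."""
    scale = (2 * a) ** -0.5
    return [scale * (g[k] - (g[k - 1] if k > 0 else 0.0)) for k in range(len(g))]


def solve_lower_toeplitz(g_tilde, q):
    """Solve K f = q by forward substitution, where K[i][j] = g_tilde[i - j] for i >= j."""
    if g_tilde[0] == 0:
        raise ValueError("the matrix is singular: g_tilde[0] must be nonzero")
    f = []
    for i in range(len(q)):
        # Contribution of the already computed coefficients f[0], ..., f[i-1].
        known = sum(g_tilde[i - j] * f[j] for j in range(i))
        f.append((q[i] - known) / g_tilde[0])
    return f


if __name__ == "__main__":
    # Kernel g = phi_0: Laguerre coefficients (1, 0, 0, 0), hence g~ = (1, -1, 0, 0).
    g_tilde = toeplitz_coefficients([1.0, 0.0, 0.0, 0.0])
    print(g_tilde)
    print(solve_lower_toeplitz(g_tilde, [1.0, 1.0, 1.0, 1.0]))
```

With the kernel $g = \varphi_0$, the right-hand side $q = (1, 1, 1, 1)$ corresponds to $f = (1, 2, 3, 4)$, since each $q_i = f_i - f_{i-1}$.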



2.3. Application to the regression model with irregular design 

It remains to incorporate two supplementary features of equation (2) in the inversion
of (9): first, the presence of the random noise $\eta$ and secondly, the possible irregularity
of the design points. This construction is due to Comte et al. [9]. Since
the observation points $t_i$ are imposed by the problem, the estimation of the Laguerre
coefficients $q^\ell$ of the function $q$ suffers from two potential drawbacks. The first is the
infinite support of the Laguerre polynomials and of the function $q$, which should not be
too problematic provided that $T_n$ is large enough and that the functions decrease
sufficiently fast at infinity. More problematic is the fact that the observation points $t_i$ are
sometimes subject to experimental constraints, which affect their distribution on $\mathbb{R}_+$.
The consistency of the estimation of $q^\ell$ is thereby deteriorated.
We will hence suppose that the following conditions are fulfilled:

• There exists an integer no such that > a for all n ^ uq. 

• $\lim T_n = \infty$, and $\lim T_n / n = 0$

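To take into account the irregularity of the design, we follow Comte et al. [9] and define
$P_n : [0; T_n] \to [0; T_n]$ a regular non decreasing function such that

$$P_n(0) = 0, \quad P_n(T_n) = T_n, \quad P_n(t_i) = \frac{i}{n} T_n \ \text{ for } i \le n \qquad (10)$$

Note $\Phi_\ell$ the $(\ell + 1) \times n$ matrix with entries $(\Phi_\ell)_{k,i} = \varphi_k(t_i)$. For any function $h \in L^2(\mathbb{R}_+)$,
we have

$$\big( P_\ell h(t_i) \big)_{i \le n} = \Big( \sum_{k \le \ell} \varphi_k(t_i) h_k \Big)_{i \le n} = \Phi_\ell^T h^\ell$$

where $P_\ell$ is the orthogonal projector onto $V_\ell$. We deduce that

$$y^\ell = (\Phi_\ell \Phi_\ell^T)^{-1} \Phi_\ell \big( P_\ell q(t_i) + \sigma \eta_i \big)_{i \le n} = K^\ell f^\ell + \sigma (\Phi_\ell \Phi_\ell^T)^{-1} \Phi_\ell\, \eta \qquad (11)$$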

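where $\eta \sim \mathcal{N}(0, I_n)$. Let us take a closer look at the matrix $\Phi_\ell \Phi_\ell^T$. Its general term
is

$$(\Phi_\ell \Phi_\ell^T)_{k,j} = \sum_{i=1}^{n} \varphi_k(t_i) \varphi_j(t_i) = \sum_{i=1}^{n} \varphi_k\big(P_n^{-1}(\tfrac{i}{n} T_n)\big)\, \varphi_j\big(P_n^{-1}(\tfrac{i}{n} T_n)\big)$$
$$\approx \frac{n}{T_n} \int_0^{T_n} \varphi_k(P_n^{-1}(\tau))\, \varphi_j(P_n^{-1}(\tau))\, d\tau = \frac{n}{T_n} \int_0^{T_n} \varphi_k(\tau) \varphi_j(\tau) P_n'(\tau)\, d\tau$$

for $n$, $T_n$ large enough. If the points $t_i$ are equispaced, taking $P_n(t) = t$ in (10) entails
that $T_n n^{-1} \Phi_\ell \Phi_\ell^T$ is close to the identity provided that $T_n$ is large enough. As in Comte
et al. [9], we hence reformulate (11) as the sequential model

$$y^\ell = K^\ell f^\ell + \sigma \sqrt{\frac{T_n}{n}}\, \xi^\ell$$

where $\xi^\ell \sim \mathcal{N}(0, \Omega_\ell)$ and $\Omega_\ell = n T_n^{-1} (\Phi_\ell \Phi_\ell^T)^{-1}$. In general, $\Omega_\ell$ somehow quantifies the
distance to the uniform design case. To ensure that the design is not too ill conditioned,
we will suppose that the following assumption is fulfilled.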

Assumption 2.3. Let $L \in \mathbb{N}$. There exists $C \ge 0$ such that for all $\ell \le L$, for all
$\lambda \in \mathrm{Sp}(\Omega_\ell)$, $\lambda \le C$.
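
To see what this assumption means for an equispaced design, one can compute the rescaled Gram matrix $T_n n^{-1} \Phi_\ell \Phi_\ell^T$ (the inverse of $\Omega_\ell$) and check how far it is from the identity. The sketch below is an illustration only: it takes $a = 1/2$, assumes the standard Laguerre functions $\varphi_k(t) = e^{-t/2} L_k(t)$, and uses hypothetical helper names.

```python
import math


def laguerre_function(k, t):
    """phi_k(t) = exp(-t / 2) * L_k(t), i.e. the case a = 1/2 (assumed definition)."""
    prev, cur = 1.0, 1.0 - t
    if k > 0:
        for j in range(1, k):
            prev, cur = cur, ((2 * j + 1 - t) * cur - j * prev) / (j + 1)
        value = cur
    else:
        value = prev
    return math.exp(-t / 2) * value


def rescaled_gram(level, n, horizon):
    """Matrix (T_n / n) * Phi Phi^T for the equispaced design t_i = i * T_n / n, i = 1..n."""
    points = [horizon * i / n for i in range(1, n + 1)]
    rows = [[laguerre_function(k, t) for t in points] for k in range(level + 1)]
    scale = horizon / n
    return [[scale * sum(u * v for u, v in zip(rows[k], rows[j]))
             for j in range(level + 1)] for k in range(level + 1)]


def max_deviation_from_identity(matrix):
    """Largest entrywise distance between the matrix and the identity."""
    size = len(matrix)
    return max(abs(matrix[i][j] - (1.0 if i == j else 0.0))
               for i in range(size) for j in range(size))


if __name__ == "__main__":
    for n in (200, 1000, 4000):
        gram = rescaled_gram(level=3, n=n, horizon=40.0)
        print(n, max_deviation_from_identity(gram))
```

The deviation shrinks as $n$ grows with $T_n$ fixed, which is the Riemann-sum argument used above; for irregular designs the same function can be fed other points to see when the eigenvalues of $\Omega_\ell$ start to drift.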

This assumption is dependent on the integer $L$, which plays the role of a maximal
resolution level, and will be adapted to the case of interest later. The inversion of (11)
now requires controls of the variable $(K^\ell)^{-1} \xi^\ell$. Under suitable properties of $f$ and $g$,
we shall be able to apply a classical inverse/thresholding procedure, and derive rates of
convergence over specific regularity spaces. These properties are the subject of Section 3.

2.4. Error in the operator

We already mentioned the fact that the resolution of (1) is usually unstable with
respect to $g$ (Abramovich et al. [3]). Furthermore, in practice, inference on the kernel $g$
is possible only through experimental noise, and requires a preliminary step of estimation
giving way to imprecision. This additional error might significantly contaminate
the result of any estimation procedure if not properly treated. Let us see how
Laguerre functions $\varphi_\ell$ allow us to handle this issue: in section 2.2, we established that the
discretization of (13) with Laguerre functions involved a Toeplitz matrix whose entries
are the Laguerre coefficients of $\tilde g$. We can thus consider $\tilde g$ as the finite impulse
response of the operator $K$ when applied to the system $(\varphi_\ell)_{\ell \ge 0}$. To take into account
the imprecision in the observations of $g$, we adopt the framework of blind deconvolution
and suppose that $\tilde g$ is not known exactly, but that we have access to the noisy version

$$\tilde g_\delta = \tilde g + \delta b \qquad (12)$$

where $b$ is a gaussian white noise on $L^2(\mathbb{R}_+)$. The generic problem of blind deconvolution
is motivated by numerous scientific fields, including for example electronic microscopy
or astrophysics, where the corresponding kernel is seldom known or directly observed.
It was adequately discussed in Efromovich and Koltchinskii [14] and Hoffmann and Reiß
[17].

Taking into account the observations (12), the projection $\tilde g^\ell$ is changed to
$\tilde g^\ell_\delta = \tilde g^\ell + \delta b^\ell$ where $b^\ell$ is a gaussian vector with covariance $I_\ell$. The new model,
adjusted from (11), becomes

$$y^\ell = K^\ell f^\ell + \sigma \sqrt{\frac{T_n}{n}}\, \xi^\ell, \qquad K^\ell_\delta = K^\ell + \delta B^\ell \qquad (13)$$

where $B^\ell = T_\ell(b)$ is a random Toeplitz matrix. In the sequel, for the sake of clarity, we
note $\varepsilon = \sigma \sqrt{T_n / n}$.
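
For readers who want to experiment with the model with noisy signal and noisy operator, here is a minimal simulation sketch. It assumes an ideally conditioned design (the noise on the signal is taken i.i.d. standard gaussian, i.e. $\Omega_\ell = I_\ell$), and the function name `simulate_observations` and its arguments are hypothetical.

```python
import random


def simulate_observations(f, g_tilde, eps, delta, seed=0):
    """Draw (y, g_tilde_noisy) with y = K f + eps * xi and g_tilde_noisy = g_tilde + delta * b.

    K is the lower triangular Toeplitz matrix with first column g_tilde.
    xi and b are i.i.d. standard gaussian vectors (assumption: Omega = identity).
    """
    rng = random.Random(seed)
    size = len(f)
    y = []
    for i in range(size):
        clean = sum(g_tilde[i - j] * f[j] for j in range(i + 1))
        y.append(clean + eps * rng.gauss(0.0, 1.0))
    g_noisy = [g_tilde[k] + delta * rng.gauss(0.0, 1.0) for k in range(size)]
    return y, g_noisy


if __name__ == "__main__":
    f = [1.0, 0.5, 0.25, 0.0]
    g_tilde = [1.0, -1.0, 0.0, 0.0]  # kernel g = phi_0
    y, g_noisy = simulate_observations(f, g_tilde, eps=1e-2, delta=1e-2)
    print(y)
    print(g_noisy)
```

Setting `eps = delta = 0` returns the noiseless pair $(K^\ell f^\ell, \tilde g^\ell)$, which is convenient for testing an inversion routine before adding noise.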



Remark 2.4. We could as well suppose that we observe $g_\delta = g + \delta b$, yet it is more
convenient to work with $\tilde g$ (the entries of the noisy Toeplitz matrix $B$ are then directly
i.i.d. standard gaussian variables). In the former case, the rest of the paper however adapts
with no change in the algorithms, since inequality (27) is satisfied as well. A modification
of the proof of Theorem 4.6 should also provide the lower bound for the second procedure.

3. Features of the target function and the kernel 

3.1. Sobolev spaces associated to Laguerre functions 

We proceed to the description of regularity spaces associated with the resolution of (13).
The following material is classical; we refer to Bongioanni and Torrea [5] or Rathnakumar
[26] for example.

Since $f \mapsto \sqrt{2a}\, f(2a\,\cdot)$ is an isometry of $L^2(\mathbb{R}_+)$, the structures defined for different
values of $a$ are equivalent. Hence we shall only concentrate on the mainstream case
where $a = 1/2$. Define the operator $\mathcal{L}$ on $L^2(\mathbb{R}_+, dx)$ by

$$\mathcal{L} = -\left( x \frac{d^2}{dx^2} + \frac{d}{dx} - \frac{x}{4} \right) \qquad (14)$$

The functions $\varphi_\ell$ are the eigenfunctions of $\mathcal{L}$ associated with eigenvalues $(\ell + \frac{1}{2})$. We
hence define the Sobolev space $W^s$ associated with Laguerre functions as

$$W^s = \{ f \in L^2(\mathbb{R}_+, dx) \ \text{s.t.}\ \mathcal{L}^s f \in L^2(\mathbb{R}_+, dx) \}$$

For a function $f \in L^2(\mathbb{R}_+, dx)$, we have the straightforward equivalence

$$f \in W^s \iff \sum_{\ell \ge 0} (\ell + \tfrac{1}{2})^{2s} f_\ell^2 < \infty$$

and the associated norm

$$\|f\|_{W^s}^2 = \sum_{\ell \ge 0} (\ell + \tfrac{1}{2})^{2s} f_\ell^2$$

For $M \ge 0$, we shall note $W^s(M)$ the Sobolev ball of radius $M$. Finally, we recall
that, as $\|\varphi_\ell\|_\infty \le 1$ for all $\ell \ge 0$, we have $s > 1/2 \Rightarrow W^s \subset C^0(\mathbb{R}_+)$. From now on, we
will hence suppose that there exists $s > 1/2$ such that $f \in W^s$.
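
The continuity of functions of $W^s$ for $s > 1/2$ follows from a short Cauchy-Schwarz argument. The derivation below is a sketch under the assumption that the $W^s$ norm is the weighted $\ell^2$ norm of the Laguerre coefficients with weights $(\ell + \frac{1}{2})^{2s}$, and that the Laguerre functions are bounded by 1 in the case $a = 1/2$.

```latex
\sup_{x \ge 0} |f(x)|
  \le \sum_{\ell \ge 0} |f_\ell| \, \|\varphi_\ell\|_\infty
  \le \sum_{\ell \ge 0} |f_\ell|
  \le \Big( \sum_{\ell \ge 0} (\ell + \tfrac{1}{2})^{2s} f_\ell^2 \Big)^{1/2}
      \Big( \sum_{\ell \ge 0} (\ell + \tfrac{1}{2})^{-2s} \Big)^{1/2}
```

The last series is finite exactly when $2s > 1$. In that case the Laguerre expansion of $f$ converges normally, so $f$ is a uniform limit of continuous functions and is itself continuous.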

3.2. Banded Toeplitz matrices 

Before entering into details about the kernel features, we introduce basic material on
Toeplitz matrices. Most of it is inspired by Böttcher and Grudsky [6] and Comte et al.
[9].

Let $a = (a_k)_{k \in \mathbb{Z}}$ be a sequence of real numbers. We recall from section 2.2 that
we note $T(a)$ the infinite Toeplitz matrix defined by

$$T(a) = \begin{pmatrix} a_0 & a_{-1} & a_{-2} & \cdots \\ a_1 & a_0 & a_{-1} & \cdots \\ a_2 & a_1 & a_0 & \cdots \\ \vdots & \ddots & \ddots & \ddots \end{pmatrix}$$

and $T_\ell(a) \in M_{\ell+1}(\mathbb{R})$ the truncated Toeplitz matrix defined as

$$(T_\ell(a))_{i,j} = (T(a))_{i,j}, \qquad i, j < \ell + 1$$

The Toeplitz matrices $T(a)$ and $T_\ell(a)$ are naturally linked to the two respective Laurent
series

$$a(z) = \sum_{k=-\infty}^{\infty} a_k z^k \quad \text{and} \quad a_\ell(z) = \sum_{k=-\ell}^{\ell} a_k z^k$$

We will refer interchangeably to the vector $a$ or the corresponding Laurent series. The
spectral norm of $T(a)$ is related to the behaviour of $a(z)$, as illustrated in the following
proposition.

Proposition 3.1. Let $a \in \ell^1(\mathbb{Z})$. Let $\mathcal{C}$ stand for the complex unit circle. We have

$$\|T(a)\|_{op} = \|a(z)\|_{circ}$$

where $\|a(z)\|_{circ} = \sup_{z \in \mathcal{C}} \big| \sum_{\ell=-\infty}^{\infty} a_\ell z^\ell \big|$. A simple corollary is the following inequality

$$\|T(a)\|_{op} \le \sum_{\ell=-\infty}^{\infty} |a_\ell|$$

In particular, Proposition 3.1 applies to the case of truncated Toeplitz matrices
$T_\ell(a)$. Moreover, if $a$ has no zero on the complex unit circle, we have

$$\limsup_{\ell \to \infty} \|T_\ell(a)\|_{op} < \infty \quad \text{and} \quad \lim_{\ell \to \infty} \|T_\ell(a)\|_{op} = \|T(a)\|_{op} \qquad (15)$$

Now suppose that $a$ and $a'$ both generate lower triangular Toeplitz matrices (i.e.
$a_k = a'_k = 0$ if $k < 0$). Then the following equalities hold for all $\ell \ge 0$:

$$T_\ell(a) T_\ell(a') = T_\ell(a') T_\ell(a) = T_\ell(aa') \quad \text{and} \quad T_\ell(a)^{-1} = T_\ell(1/a) \qquad (16)$$

In other words, matrix multiplication (resp. inversion) is equivalent to power series
multiplication (resp. inversion).
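
The correspondence between lower triangular Toeplitz matrices and truncated power series can be illustrated with a few lines of code. This is a sketch with hypothetical helper names; coefficient lists are assumed to contain at least as many entries as the requested size.

```python
def series_mul(a, b, n):
    """First n coefficients of the product of the power series a(z) and b(z)."""
    return [sum(a[j] * b[k - j] for j in range(k + 1)) for k in range(n)]


def series_inv(a, n):
    """First n coefficients of 1 / a(z); requires a[0] != 0."""
    inv = [1.0 / a[0]]
    for k in range(1, n):
        acc = sum(a[j] * inv[k - j] for j in range(1, k + 1))
        inv.append(-acc / a[0])
    return inv


def lower_toeplitz(a, n):
    """n x n lower triangular Toeplitz matrix with first column a[0], ..., a[n-1]."""
    return [[a[i - j] if i >= j else 0.0 for j in range(n)] for i in range(n)]


def matmul(left, right):
    """Plain product of two square matrices given as lists of rows."""
    n = len(left)
    return [[sum(left[i][k] * right[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    a = [1.0, -1.0, 0.0, 0.0, 0.0]   # a(z) = 1 - z
    b = [2.0, 0.5, 0.25, 0.0, 0.0]
    n = 5
    print(matmul(lower_toeplitz(a, n), lower_toeplitz(b, n)) == lower_toeplitz(series_mul(a, b, n), n))
    print(series_inv(a, n))  # coefficients of 1 / (1 - z)
```

For $a(z) = 1 - z$ the inverse series is $1 + z + z^2 + \dots$, so the inverse matrix is the lower triangular matrix of ones.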

3.3. Degree of ill posedness 

We now need to specify the properties of $K$ as a blurring operator of $L^2(\mathbb{R}_+)$. Usually
the operator $K$ is not compact, and the problem (1) is ill-posed. This results in
practical instabilities when trying to invert equation (11) from discrete observations.
The ill-posedness of the problem is quantified by a constant, called the degree of
ill-posedness (DIP) of the problem (see Nussbaum and Pereverzev [25], Mathé and
Pereverzev [23] for a generic review). We adapt this concept
to our framework, and make the following assumption.

Assumption 3.2 (Degree of ill-posedness of $g$). There exist $\nu \ge 0$, $Q \ge 0$ such that,
for all $\ell \ge 0$,

$$\|(K^\ell)^{-1}\|_{op} \le Q (\ell \vee 1)^{\nu}$$

$\nu$ is called the degree of ill-posedness of $g$ (or equivalently of $K$). We note $\mathcal{K}_\nu(Q)$ the set of
functions which satisfy this assumption.

We shall see examples of kernels satisfying this assumption further on. For the moment,
we concentrate on the treatment of observations (13) in the context we just described.

3.4. Algorithms and rates of convergence

The main challenge which remains to be treated now is to articulate the two critical 
steps of inversion and regularization, via adapted procedures. For example, let us give 
a brief overview of the methodology in Comte et al. [9]: Let t e N, and let A be the 
following contrast function, defined on Mr by 

A:t^\\t\\ 2 -2(t,(K e )- 1 y i ) 

Note $\|\cdot\|_{op}$ the spectral norm and $\|\cdot\|_{HS}$ the Hilbert-Schmidt norm. A model selection is
performed on the maximal level $L$, by introducing the following penalizing factor ($B > 0$
is an arbitrary constant):

pen(£) = ^^((l + B)\\^Q% S + (1 + B)~\v + l)||^||^log^ 






where Q l = (K^Qt^K 1 )- 1 and -jQ l is a lower triangular matrix satisfying 
vQ^'-y^ = Q e . The maximal level L is hence chosen as 

L = argmin{A 2 ((K^)- 1 ^) + pen(£)} 

where £(n) is a large enough resolution level, possibly depending on n, and the ensuing 
estimator of / is 

We follow here a different path: we suppose that the target function belongs to a
Sobolev-Laguerre space, and perform thresholding techniques in a minimax framework.
Furthermore, our results are asymptotic with regard to $\varepsilon$, $\delta$. Were $g$ known, the
estimation of $f$ from observations (13) would amount to solving a standard inverse problem
with signal noise. To this end, a prolific literature is available (a selected list is Donoho
[12], Abramovich and Silverman [2], Cohen et al. [8]). In order to take into account
the presence of noise in the operator, we shall hence apply a preliminary regularizing
thresholding procedure to the noisy operator $K_\delta$ in order to ensure the stability of the
further inversion step. To that end, define the maximal level as



$$L^I = A \left( \varepsilon \sqrt{|\log \varepsilon|} \vee \delta |\log \delta| \right)^{-\frac{1}{\nu+1}} \qquad (17)$$

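with $A$ a positive constant. Define also the two thresholding levels

$$O_{\ell,\delta} = \kappa \big( (\ell \vee 1) \log(\ell \vee 2) \big)^{1/2}\, \delta \sqrt{|\log \delta|} \qquad (18)$$
$$S^I_\ell = (\ell \vee 1)^{\nu} \big( \tau_{sig}\, \varepsilon \sqrt{|\log \varepsilon|} \vee \tau_{op}\, \delta |\log \delta| \big) \qquad (19)$$

For $\ell \ge 0$, note $\zeta_\ell = \langle (K^\ell_\delta)^{-1} \mathbf{1}_{\{\|(K^\ell_\delta)^{-1}\|_{op} \le (O_{\ell,\delta})^{-1}\}}\, y^\ell, \varphi_\ell \rangle$. The estimator $\hat f^I$ of $f$ is defined
by

$$\hat f^I = \sum_{\ell \le L^I} \zeta_\ell\, \mathbf{1}_{\{|\zeta_\ell| > S^I_\ell\}}\, \varphi_\ell$$

We call this procedure Algorithm I. The preliminary threshold performed on $(K^\ell_\delta)^{-1}$
ensures its proximity with $(K^\ell)^{-1}$ with high probability (see Lemma 6.2). We now
study the squared loss performance of the procedure.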
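
The two-step logic of Algorithm I (a stability check on the inverted noisy operator, then a hard threshold on each inverted coefficient) can be sketched as follows. This is an illustration under several assumptions, not the author's implementation: the operator threshold is taken of the form $\kappa((\ell \vee 1)\log(\ell \vee 2))^{1/2}\delta\sqrt{|\log\delta|}$ and the signal threshold of the form $(\ell \vee 1)^{\nu}(\tau_{sig}\varepsilon\sqrt{|\log\varepsilon|} \vee \tau_{op}\delta|\log\delta|)$; the spectral norm of the inverse is replaced by its $\ell^1$ upper bound from Proposition 3.1, which makes the check more conservative; the maximal level is passed by the caller; and all names are hypothetical.

```python
import math


def series_inv(a, n):
    """First n coefficients of 1 / a(z); requires a[0] != 0."""
    inv = [1.0 / a[0]]
    for k in range(1, n):
        acc = sum(a[j] * inv[k - j] for j in range(1, k + 1))
        inv.append(-acc / a[0])
    return inv


def algorithm_one(y, g_noisy, eps, delta, nu, max_level,
                  kappa=0.3, tau_sig=1.0, tau_op=1.0):
    """Thresholded inversion sketch; eps and delta are assumed to lie in (0, 1)."""
    noise_level = max(tau_sig * eps * math.sqrt(abs(math.log(eps))),
                      tau_op * delta * abs(math.log(delta)))
    estimate = []
    for level in range(min(max_level, len(y) - 1) + 1):
        # First column of the inverse of the noisy (level + 1) x (level + 1) Toeplitz matrix.
        gamma = series_inv(g_noisy, level + 1)
        # l1 norm of the coefficients: an upper bound on the spectral norm of the inverse.
        op_bound = sum(abs(c) for c in gamma)
        o_level = (kappa * math.sqrt(max(level, 1) * math.log(max(level, 2)))
                   * delta * math.sqrt(abs(math.log(delta))))
        if op_bound > 1.0 / o_level:
            estimate.append(0.0)  # inversion judged unstable at this level
            continue
        # Last coordinate of the inverted observation vector at this level.
        zeta = sum(gamma[level - j] * y[j] for j in range(level + 1))
        s_level = max(level, 1) ** nu * noise_level
        estimate.append(zeta if abs(zeta) > s_level else 0.0)
    return estimate


if __name__ == "__main__":
    g_tilde = [1.0, -1.0, 0.0, 0.0, 0.0, 0.0]  # kernel g = phi_0
    f = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
    y = [sum(g_tilde[i - j] * f[j] for j in range(i + 1)) for i in range(6)]
    print(algorithm_one(y, g_tilde, eps=1e-3, delta=1e-3, nu=1.0, max_level=5))
```

On noiseless data the sketch returns the two nonzero coefficients of $f$ and sets the others to zero; with simulated noise, the thresholding constants play the role discussed in the numerical section.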

Theorem 3.3. Let $M \ge 0$, $s > 1/2$. Let $\nu \ge 0$, $Q \ge 0$. Suppose that Assumption 2.3
holds for $L = L^I$. Then for sufficiently large thresholding constants $\kappa$, $\tau_{sig}$ and $\tau_{op}$,



sup E!|/ J -/||<(5|log5|) 2( ^ )+1 v(.^loF 

ge>C„(Q) 

where $\lesssim$ means inequality up to a constant depending only on $A$, $\kappa$, $\tau_{sig}$, $\tau_{op}$, $s$, $M$, $\nu$, $Q$.






The rates in Theorem 3.3 reveal two components, accounting respectively for the
imprecision in the observation of the operator and the signal. The latter is fairly classical
in non parametric statistics (Nussbaum and Pereverzev [25], Johnstone et al. [18]) where
it is also optimal, while the former is standard (and optimal too) in blind deconvolution
on Hilbert spaces (Efromovich and Koltchinskii [14], Hoffmann and Reiß [17]). Thus,
we do not study the optimality of these rates in this paper, but rather concentrate on a
more specific framework related to the problem of interest.

4. Adaptation to the standard framework of Laplace deconvolution 

We now discuss the adaptation of our algorithm to the mainstream framework of Laplace
deconvolution, as exposed in Abramovich et al. [3] or Comte et al. [9]. As we shall see,
this more restrictive framework allows us to treat observations (13) more efficiently. To
this end, we first define a more restrictive version of the degree of ill-posedness.

Assumption 4.1 (Second kind degree of ill-posedness). Note $\gamma_k = \langle (1/\tilde g), \varphi_k \rangle$, so that
$(1/\tilde g)(z) = \sum_{k \ge 0} \gamma_k z^k$. There exist $\nu > 0$ and $Q_2, Q_1 > 0$ such that for all $\ell \ge 0$,

Y i ~rUQ2(£vir- 1 (20) 

fc=0 
I k 

2 2 -£> Qi(! v if (21) 

fc=0n=0 

For $Q = (Q_1, Q_2)$, we note $\mathcal{G}_\nu(Q)$ the set of functions $g \in L^2(\mathbb{R}_+)$ such that
Assumption 4.1 holds. Note that the validity of this assumption automatically entails
Qi ^ (l + ^r)Q2- Note also that the left term in (21) is the Hilbert- Schmidt norm of 
$(K^\ell)^{-1}$. Thus, Assumption 4.1 is more restrictive than Assumption 3.2. However, it is
satisfied by a natural class of functions $g$:

Proposition 4.2 (Comte et al. [9], Lemma 3 / Lemma 5). Suppose that there exist
$C$, $\nu > 1/2$, $\mu \in \mathbb{C}$ and $w(z) = \prod_{i} (z - z_i)$, $|z_i| > 1$, a polynomial function with no zero
inside of the complex unit disc, such that

$$\tilde g(z) = C\, w(z) (\mu - z)^{\nu} \qquad (22)$$

Then Assumption 4-1 is satisfied. Furthermore, if w = 1 and v ^ 0, then j-y^l ~ f^y- 
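
The Hilbert-Schmidt norm of the inverse of a lower triangular Toeplitz matrix can be read directly from the coefficients $\gamma_k$ of the inverse series, which is what makes Assumption 4.1 easy to check on examples. The sketch below is an illustration with hypothetical function names; reading the left-hand side of (21) as the double sum $\sum_{k \le \ell} \sum_{n \le k} \gamma_n^2$ is an assumption based on the remark that it is the Hilbert-Schmidt norm of $(K^\ell)^{-1}$.

```python
import math


def series_inv(a, n):
    """First n coefficients of 1 / a(z); requires a[0] != 0."""
    inv = [1.0 / a[0]]
    for k in range(1, n):
        acc = sum(a[j] * inv[k - j] for j in range(1, k + 1))
        inv.append(-acc / a[0])
    return inv


def inverse_hs_norm(g_tilde, level):
    """Hilbert-Schmidt norm of the inverse of the (level + 1) x (level + 1) Toeplitz matrix."""
    gamma = series_inv(g_tilde, level + 1)
    # Entry (i, j) of the inverse is gamma[i - j], so gamma[k] appears level + 1 - k times.
    return math.sqrt(sum((level + 1 - k) * gamma[k] ** 2 for k in range(level + 1)))


def double_sum(g_tilde, level):
    """Sum over k <= level of the partial sums of gamma_n^2, n <= k."""
    gamma = series_inv(g_tilde, level + 1)
    return sum(sum(gamma[n] ** 2 for n in range(k + 1)) for k in range(level + 1))


if __name__ == "__main__":
    g_tilde = [1.0, -1.0, 0.0, 0.0]  # g~(z) = 1 - z, i.e. nu = 1 and gamma_k = 1
    for level in range(4):
        print(level, inverse_hs_norm(g_tilde, level) ** 2, double_sum(g_tilde, level))
```

For $\tilde g(z) = 1 - z$ all $\gamma_k$ equal 1, so the squared norm is $(\ell + 1)(\ell + 2)/2$, which grows like $\ell^{2\nu}$ with $\nu = 1$.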

For completeness, we give a proof of Proposition 4.2 in section 6. We now turn to 
the standard framework of Laplace deconvolution, as exposed in Abramovich et al. [3] 
and Comte et al. [9]. To this end, we define the following assumptions concerning the 
kernel g. 

Assumption 4.3.
(A1) There exists an integer $r \ge 1$ such that

$$\frac{d^j g}{dt^j}\Big|_{t=0} = \begin{cases} 0 & \text{if } j = 0, 1, \ldots, r-2 \\ B_r \ne 0 & \text{if } j = r-1 \end{cases}$$

(A2) $g \in L^1([0, +\infty))$ is $r$ times differentiable and $g^{(r)} \in L^1([0, +\infty))$.

(A3) The Laplace transform of $g$ has no zeros with non negative real parts except for the
zeros of the form $\infty + ib$.

The consequences of these assumptions are well formulated in the terms of the
preceding framework:

Proposition 4.4 (Comte et al. [9], Lemma 3). Suppose that Assumptions (A1), (A2)
and (A3) hold. Then the hypotheses of Proposition 4.2 are satisfied with $\mu = 1$, $\nu = r$.

Hence, Assumption 4.1 is verified with $\nu = r$ and Algorithm I applies. However,
Assumption 4.1 provides additional information on the behaviour of $(K^\ell)^{-1}$. We adapt
Algorithm I to this new framework, by operating the following changes:

• Set the maximal level to 

$$L^{II} = A \left( \varepsilon \sqrt{|\log \varepsilon|} \vee \delta |\log \delta| \right)^{-1}$$

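• Set the signal thresholding level to

$$S^{II}_\ell = \begin{cases} \|(K^\ell_\delta)^{-1}\|_{HS}\, (\ell \vee 1)^{-1/2} \big( \tau_{sig}\, \varepsilon \sqrt{|\log \varepsilon|} \vee \tau_{op}\, \delta |\log \delta| \big) & \text{if } \|(K^\ell_\delta)^{-1}\|_{op} \le (O_{\ell,\delta})^{-1} \\ +\infty & \text{if } \|(K^\ell_\delta)^{-1}\|_{op} > (O_{\ell,\delta})^{-1} \end{cases} \qquad (23)$$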

where $\|A\|_{HS} = \sqrt{\mathrm{Tr}(A^T A)}$ is the Hilbert-Schmidt norm. We call the modified procedure
Algorithm II and note $\hat f^{II}$ the corresponding estimated function. A notable gain of this
new algorithm is its independence with regard to the parameter $\nu$. Indeed, Assumption
4.1 allows us to use $\|(K^\ell_\delta)^{-1}\|_{HS}$ in (19) as a substitute for $(\ell \vee 1)^{\nu}$ and to overestimate the
'true' maximal level $L^I$. Its performances are stated in the next theorem:

Theorem 4.5. Let $M \ge 0$. Let $\nu > 0$, $Q_2, Q_1 > 0$ and $s > 1/2$. Suppose that
Assumption 2.3 holds with $L = L^{II}$. Then for sufficiently large thresholding constants
$\kappa$, $\tau_{sig}$ and $\tau_{op}$,

sup E||/ JI -/|| < (5\\og5\)~ v (e^y^F\)~ 

where $\lesssim$ means inequality up to a constant depending only on $A$, $\kappa$, $\tau_{sig}$, $\tau_{op}$, $s$, $M$, $\nu$, $Q_1$, $Q_2$.

Thus, in addition to the adaptivity over the parameter $\nu$, the strengthening of
Assumption 3.2 via (20) and (21) allows us to improve on the rates of Theorem 3.3
with regard both to the operator and signal noise. Our next result shows that the
rate achieved in Theorem 4.5 is indeed optimal, up to logarithmic terms. The lower
bound will not decrease for increasing noise levels $\delta$ and $\varepsilon$, whence it suffices to treat
separately the cases $\delta = 0$ and $\varepsilon = 0$.

Theorem 4.6. Let $s > 1/2$, let $M \ge 0$, $\nu > 1/2$ and $Q_2 > c_\nu Q_1 > 0$. Here $c_\nu$ is a
constant depending only on $\nu$ which we will not seek to specify. We have

~ s i s | 

inf sup E||/ — /|| > 5 s + v | logo| v s»+ v \\oge\ 

f /eW s (M) 
geS„(Q) 

where the infimum is taken among all estimators f of f based on observations (13). 

Combining Theorem 4.5 together with Theorem 4.6, we conclude that our algorithm
is minimax over $W^s(M)$ to within logarithmic terms in $\varepsilon$ and $\delta$, uniformly with regard
to the blurring kernel $g \in \mathcal{G}_\nu(Q)$.

5. Practical performances 

In this section we study the practical performances of the two procedures developed
above. Note that three potential sources of errors may contaminate the quality of the
observations in (13): the signal precision $\sigma \sqrt{T_n / n}$, the operator precision $\delta$ and the design
quality $\|\Omega_\ell\|_{op}$. We shall hence emphasize their influence on the estimation of $f$, as well
as their respective interactions.

Our first aim is to study the interaction between the effects of signal and kernel noise in
the two reconstruction procedures. To this end, we will isolate them from the effect
of the design, and suppose that the latter is ideally conditioned by setting $\Omega_\ell = I_\ell$.
Let us start with a few details concerning the tuning parameters of Algorithms I and
II. Setting up these procedures requires the preliminary definition of $A$, $\kappa$, $\tau_{sig}$
and $\tau_{op}$.

Tuning parameters: for the definition of the maximal level of resolution, we set $A = 1$
for both algorithms. The concrete choice of adequate thresholding constants $\kappa$ and $\tau$ is
a complex issue. Our practical choices will be based on the following remark, inspired by
Donoho and Johnstone [13]: in the case of direct estimation on the real line, the universal
threshold, which is both efficient and simple to implement, takes the form $2\sqrt{|\log \varepsilon|}$.
A consistent interpretation is to consider that this threshold should kill any pure noise
signal. We will adapt this reasoning to the case of interest.

Choice of $\kappa$: we use as a benchmark the case where $g = 0$. Given $\delta$ large enough, we
define $\kappa$ as the smallest value such that, for all $\ell < 10$,
$\mathbf{1}_{\{\|(K^\ell_\delta)^{-1}\|_{op} \le (O_{\ell,\delta})^{-1}\}} = 0$. The
results are reported in Table 1 and give $\kappa = 0.3$.

Choice of $\tau_{sig}$ and $\tau_{op}$: the role of $\tau_{sig}$ and $\tau_{op}$ is to control the influence
of the signal (resp. the operator) error. To choose $\tau_{sig}$ (resp. $\tau_{op}$), we therefore set
$\varepsilon_{sig} > \delta_{sig} > 0$ (resp. $\delta_{op} > \varepsilon_{op} > 0$) large enough. We resort to the case $f = 0$ as a
benchmark: we have $\langle f, \varphi_\ell \rangle = 0$ for $\ell \ge 1$, consequently the observations
$\langle y_{\varepsilon_{sig}}, \varphi_\ell \rangle$, $\ell \ge 0$, are pure noise. We hence simulate $K_{\delta_{sig}}$ and, integrating the
previously computed value of $\kappa$, apply the procedure for increasing values of $\tau_{sig}$ (resp.
$\tau_{op}$) until all the computed coefficients $\langle \hat f^i, \varphi_\ell \rangle$ ($i = I, II$) are killed for $\ell < 10$.
The results are reported
in Table 2.

   κ      0.1    0.2    0.3
   N       3      1      0

Table 1. Choice of $\kappa$. $N$ is the average number, computed on a basis of 10
realizations, of levels $\ell < 10$ such that $\|(K^\ell_\delta)^{-1}\|_{op} < O_{\ell,\delta}^{-1}(\kappa)$. We have $\delta = 10^{-2}$.



   τ_sig   0.3   0.4   0.5   0.6   0.7   0.8   0.9    1    |   τ_op   0.1
   N_I      1     1     0     0     0     0     0     0    |   N_I     0
   N_II     7     5     4     3     2     1     1     0    |   N_II    0

Table 2. Choice of $\tau$. For $(\delta_{sig}, \varepsilon_{sig}) = (\varepsilon_{op}, \delta_{op}) = (10^{-2}, 10^{-1})$ and each value of
$\tau$, we computed 10 times the described procedure and reported $N_i$ the average number
of remaining Laguerre coefficients for Algorithm $i$.




Figure 1. Data and noisy observations of $g$ and $q$: (a) target function $f_1$, (b) kernel $g$,
(c) $q = Kf$.

We now apply the two procedures to the case where $f_1(t) = (t^2 - t)\exp(-t)$ and
$g = \varphi_0$ (a graphical representation of these two functions is presented in Figure 1).
We have

$$(1/\tilde g)(z) = (1 - z)^{-1} = \sum_{\ell \ge 0} z^\ell$$

hence Assumptions 3.2 and 4.1 are both satisfied taking $\nu = 1$. For several values of $\varepsilon$
and $\delta$, we report the corresponding squared loss, computed on a basis of 500 realizations
with the use of Parseval's identity, in Table 3. The corresponding results are presented
in Figure 2 for one particular realization of $b$. The results indicate that the transition
between the two types of errors occurs when $\delta$ is higher than $\varepsilon$, reflecting a prevailing
effect of the signal noise $\varepsilon$ over the operator error $\delta$ in practice. As Theorems 3.3 and 4.5
suggest, the second algorithm outperforms the first in (almost) every case.
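
Computing the squared loss through Parseval's identity only requires the Laguerre coefficients of the target and of the estimate. The sketch below is an illustration for $f_1(t) = (t^2 - t)\exp(-t)$ with $a = 1/2$; it assumes the standard Laguerre functions $\varphi_k(t) = e^{-t/2} L_k(t)$, approximates the coefficients by Simpson's rule on a truncated interval, and uses hypothetical function names.

```python
import math


def laguerre_function(k, t):
    """phi_k(t) = exp(-t / 2) * L_k(t), i.e. the case a = 1/2 (assumed definition)."""
    prev, cur = 1.0, 1.0 - t
    if k == 0:
        return math.exp(-t / 2) * prev
    for j in range(1, k):
        prev, cur = cur, ((2 * j + 1 - t) * cur - j * prev) / (j + 1)
    return math.exp(-t / 2) * cur


def laguerre_coefficient(func, k, horizon=50.0, steps=5000):
    """Simpson approximation of int_0^horizon func(t) phi_k(t) dt (steps must be even)."""
    h = horizon / steps
    total = func(0.0) * laguerre_function(k, 0.0) + func(horizon) * laguerre_function(k, horizon)
    for i in range(1, steps):
        t = i * h
        total += (4 if i % 2 else 2) * func(t) * laguerre_function(k, t)
    return total * h / 3


def f1(t):
    """Target function of the first numerical example."""
    return (t * t - t) * math.exp(-t)


def squared_loss(estimated, target):
    """Squared L2 distance computed coefficientwise (Parseval's identity)."""
    return sum((e - c) ** 2 for e, c in zip(estimated, target))


if __name__ == "__main__":
    coefficients = [laguerre_coefficient(f1, k) for k in range(12)]
    print(coefficients[:4])
    # The sum of squares should be close to the squared L2 norm of f1, which equals 1/4.
    print(sum(c * c for c in coefficients))
```

The first coefficient can be checked by hand: $\int_0^\infty (t^2 - t) e^{-3t/2}\, dt = 16/27 - 12/27 = 4/27$.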

                        Algorithm I                            Algorithm II
   δ \ ε        0      10^-3    10^-2   3·10^-2       0      10^-3    10^-2   3·10^-2
   0                   0.020    0.141    0.348                0.012    0.109    0.312
   10^-3      0.004    0.020    0.141    0.352      0.005    0.012    0.108    0.301
   10^-2      0.047    0.054    0.143    0.344      0.053    0.039    0.116    0.318
   3·10^-2    0.170    0.169    0.190    0.348      0.118    0.109    0.145    0.324

Table 3. Normalized mean squared error of the two procedures applied to the functions
$f_1$ and $g$. The computations were performed using a Monte Carlo method on 500
realizations.

(a) Algorithm I (b) Algorithm II

Figure 2. Estimation of $f_1$ for predominant signal noise $(\varepsilon, \delta) = (10^{-2}, 10^{-3})$ and
predominant operator noise $(\delta, \varepsilon) = (10^{-3}, 10^{-2})$.

Discussion on the design irregularity: to control the squared risk of the two
procedures, one needs Assumption 2.3 to be fulfilled. If not, the eigenvalues of the matrix
$\Omega_\ell$ become potentially too large, and observations (11) are not conveniently treatable.
In this case, it is preferable to lower the maximal level down to a point where $\|\Omega_\ell\|_{op}$
remains under control. To this end, we change the maximal level of the two respective
procedures to

$$N^i = L^i \wedge \max\{\ell \ge 0 \ \text{s.t.}\ \|\Omega_\ell\|_{op} \le \alpha\}, \qquad i = I, II$$

where $\alpha$ is an arbitrary thresholding constant, set to 1.5 in the sequel. We now fix
$\sigma = \delta = 10^{-2}$ and choose the design points $t_i$ as $t_i = 100\, i/n$ for $n = 200, 250, 750$
and $1000$. Taking the same kernel $g = \varphi_0$, and setting $f_2(t) = (t^{1/2} - t)\exp(-t)$, we
compare the performances of the new choice $N^i$ to the previous one $L^i$, by computing
the respective mean squared losses on a basis of 500 observations and report the result
in Table 4. The results show a minor effect of the design ill-posedness on Algorithm I, 
n                      200        250        750        1000

Algorithm I
  (L^I, N^I)           (6,6)      (6,6)      (6,6)      (6,6)
  MSE, L^I             0.273      0.270      0.264      0.258
  MSE, N^I             0.275      0.272      0.264      0.257

Algorithm II
  (L^II, N^II)         (37,12)    (37,15)    (37,27)    (37,27)
  MSE, L^II            1.336      0.559      0.289      0.253
  MSE, N^II            0.294      0.291      0.284      0.256

Table 4. Normalized mean squared error of the two procedures when the design
consists of 200 equispaced points on the interval [0, 100]. We compare the
performances of the two maximal resolution levels $L^i$ and $N^i$ for the parameters
$\sigma = \delta = 10^{-2}$, $g = \varphi_0$ and $f_2(t) = (t^{1/2} - t)\exp(-t)$.



Figure 3. Result of the two different maximal levels $L^i$ and $N^i$ to estimate $f_2$, for a
particular realization of $b$ and of the signal noise. The design consists of 200 equidistant
observation points in [0, 100]. The noise levels are $\sigma = \delta = 10^{-2}$. Panels: (a) Algorithm I
($L^{I} = 6$, $N^{I} = 6$, target function), (b) Algorithm II ($L^{II} = 37$, $N^{II} = 12$, target function).



since $L^{I}$ is usually already smaller than $N^{I}$. However, the gain is notable for Algorithm
II when $n \le 250$. To illustrate this point, we plot in Figure 3 the corresponding results
when $n = 200$.
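The level adjustment discussed in this paragraph can be sketched as follows, reading the modified level as the minimum of the original maximal level and the largest level at which the operator norm of the design matrix stays below the constant $a$. The function name, its input `omega_op_norms` (the operator norms, assumed to be computed elsewhere) and the fallback value 0 are assumptions made for the illustration, not part of the procedures of the paper.

```python
def adjusted_max_level(L, omega_op_norms, a=1.5):
    """Return min(L, max{l >= 0 : omega_op_norms[l] < a}).

    omega_op_norms[l] is assumed to hold the operator norm of the design
    matrix at level l (hypothetical input, computed elsewhere).
    Returns 0 when no level satisfies the constraint (a convention chosen here).
    """
    admissible = [l for l, norm in enumerate(omega_op_norms) if norm < a]
    largest = max(admissible) if admissible else 0
    return min(L, largest)

# Toy norms growing with the level: the constraint caps the level at 3.
norms = [1.0, 1.1, 1.2, 1.4, 1.6, 2.0]
print(adjusted_max_level(37, norms), adjusted_max_level(2, norms))
```

When the original level is already small, as reported for Algorithm I, the cap is inactive and both levels coincide.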

Back to the regression model 

We now turn back to the original model (2) to apply the two procedures. It is well 
known that this model is asymptotically equivalent to (11), in the sense that a fine 
Figure 4. Adaptation of the procedures to the regression framework. Here, MSE
denotes the normalized mean squared error for each algorithm (computed with 500
realizations). The target function is $f_3$ and the noise levels are $\sigma = \delta = 5\cdot 10^{-2}$.



enough design will provide an estimation of the Laguerre coefficients with a negligible
error when $n \to \infty$. We work with $f_3(z) = (1-z)^{1/2}$, $g = \varphi_0$, $\delta = 10^{-2}$, and suppose
that the design consists of the points $t_i = \sum_{j=1}^{i} (\text{step} + |X_j|)$ where $(X_j)_{j \le n}$ is an i.i.d.
sequence of $\mathcal{N}(0, 10^{-2})$ variables. We observe the noisy values $y(t_i) = q(t_i) + \sigma\eta_i$ where
$q(z) = (1-z)^{3/2}$, and compute the Laguerre coefficients $q_\ell$ via the approximation

$$q_\ell \approx \sum_{i=1}^{n-1} \frac{q(t_i)\varphi_\ell(t_i) + q(t_{i+1})\varphi_\ell(t_{i+1})}{2}\,(t_{i+1} - t_i).$$

We apply the two procedures and present the results in Figure 4.
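The quadrature step can be illustrated by a minimal sketch, assuming a trapezoidal rule on a possibly irregular design and the normalization $\varphi_\ell(t) = \sqrt{2a}\,e^{-at}L_\ell(2at)$ of the Laguerre functions (the value $a = 1/2$, the function names and the reading of $10^{-2}$ as a variance are assumptions of this example, not taken from the text).

```python
import numpy as np
from numpy.polynomial.laguerre import lagval

def laguerre_function(ell, t, a=0.5):
    """phi_ell(t) = sqrt(2a) * exp(-a t) * L_ell(2 a t) (assumed normalization)."""
    coeffs = np.zeros(ell + 1)
    coeffs[ell] = 1.0                      # select the Laguerre polynomial of degree ell
    return np.sqrt(2 * a) * np.exp(-a * t) * lagval(2 * a * t, coeffs)

def laguerre_coefficients(t, y, n_coeffs, a=0.5):
    """Trapezoidal approximation of the inner products <y, phi_ell>, ell < n_coeffs."""
    dt = np.diff(t)
    out = np.empty(n_coeffs)
    for ell in range(n_coeffs):
        v = y * laguerre_function(ell, t, a)
        out[ell] = np.sum(0.5 * (v[:-1] + v[1:]) * dt)
    return out

def random_design(n, step, rng):
    """Irregular design with increments step + |X_j|, X_j i.i.d. N(0, 1e-2) (variance assumed)."""
    return np.cumsum(step + np.abs(rng.normal(0.0, 0.1, size=n)))

rng = np.random.default_rng(0)
t = random_design(2000, 0.01, rng)
y = laguerre_function(2, t) + 0.01 * rng.standard_normal(t.size)   # noisy samples of phi_2
print(np.round(laguerre_coefficients(t, y, 5), 2))
```

On a fine enough design the estimated coefficients concentrate on the index of the sampled Laguerre function, which is the sense in which the quadrature error becomes negligible.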
6. Proofs 

In the sequel, for the sake of clarity, we suppose that $\tau_{sig} = \tau_{op} = \tau$.
6.1. Proof of Proposition 4.2

Proof. We can restrict ourselves to the case where $\mu = 1$. Proposition 16 applied to
equality (22) entails

$$\forall \ell \ge 0, \quad T_\ell(1/g) = C^{-1}\,T_\ell(w^{-1})\,T_\ell((1-z)^{-\nu}), \qquad T_\ell((1-z)^{-\nu}) = C\,T_\ell(w)\,T_\ell(1/g).$$
As a consequence,



£ il = WiVaff < c- l \\T £ {w- l )U\(i - Z )7t (24) 

fc=0 

and IKK^^IIhs > C- x \T t (w)\£\T t ((\ z)^)\\ m (25) 



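Since $w$ is assumed to have no zeros on C, we deduce from Proposition 3.1 that both
$\|w^{-1}\|_{circ}$ and $\|w\|_{circ}$ are finite, and from (15) that

$$\|T_\ell(w^{-1})\|_{op} \lesssim 1 \quad \text{and} \quad \|T_\ell(w)\|_{op} \lesssim 1.$$

It remains to treat the binomial series $(1-z)^{-\nu}$. This series can be expanded as

$$(1-z)^{-\nu} = \sum_{\ell \ge 0} \binom{-\nu}{\ell} (-z)^\ell,$$

where $\binom{-\nu}{\ell} = \frac{\Gamma(-\nu+1)}{\Gamma(\ell+1)\,\Gamma(-\nu-\ell+1)}$ is the generalized binomial coefficient. Furthermore, we
have

$$\left|\binom{-\nu}{\ell}\right| \sim \frac{\ell^{\nu-1}}{\Gamma(\nu)} \qquad (26)$$

which is a direct consequence of Euler's definition of the Gamma function
$\Gamma(z) = \lim_{k \to \infty} \frac{k!\,k^{z}}{\prod_{i=0}^{k}(z+i)}$. Since $\nu > 1/2$, the series $\sum_{k} \binom{-\nu}{k}^2$ is hence divergent, and there
exist $Q_2, Q_1 > 0$ such that, for all $\ell \ge 0$,

$$\sum_{k=0}^{\ell} \binom{-\nu}{k}^2 \le Q_1 (\ell \vee 1)^{2\nu-1} \quad \text{and} \quad \sum_{k=0}^{\ell}\sum_{n=0}^{k} \binom{-\nu}{n}^2 \ge Q_2 (\ell \vee 1)^{2\nu}.$$

The proof is complete thanks to (24) and (25). □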
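The polynomial growth of the generalized binomial coefficients used in this proof can be checked numerically. The sketch below relies on the standard identity $|\binom{-\nu}{k}| = \Gamma(k+\nu)/(\Gamma(\nu)\,k!)$ and compares it with $k^{\nu-1}/\Gamma(\nu)$; the function name and the value $\nu = 1.5$ are chosen for illustration only.

```python
import math

def abs_binom_neg(nu, k):
    """|binom(-nu, k)| = Gamma(k + nu) / (Gamma(nu) * k!), computed through log-Gamma."""
    return math.exp(math.lgamma(k + nu) - math.lgamma(nu) - math.lgamma(k + 1))

nu = 1.5
for k in (10, 100, 1000, 10000):
    ratio = abs_binom_neg(nu, k) / (k ** (nu - 1) / math.gamma(nu))
    print(k, ratio)   # the ratio approaches 1 as k grows
```

Working with `lgamma` avoids the overflow that a direct evaluation of the factorials would cause for large $k$.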
6.2. Proofs of Theorems 3.3 and 4.5

6.2.1. Preliminary lemmas We begin with the following lemmas. Lemma 6.1 is a
concentration inequality on the variable $\|B^\ell\|_{op}$, which results from a concentration
inequality on subgaussian processes. Lemma 6.2 states that $\|(K_\delta^\ell)^{-1}\|_{op}$ behaves as
$\|(K^\ell)^{-1}\|_{op}$ on a set with large probability. Finally, Lemma 6.3 establishes deviation
bounds on the variables $\zeta_\ell - f_\ell$ which will be useful throughout the proofs of Theorem
3.3 and Theorem 4.5.

Lemma 6.1. There exist $\beta_0$, $c_0$ independent of $\ell \ge 0$, such that, for all $\ell \ge 0$, for
all $t \ge \beta_0$,

$$\mathbb{P}\Big(\frac{1}{\sqrt{(\ell \vee 1)\log(\ell \vee 2)}}\,\|B^\ell\|_{op} > t\Big) \le \exp(-c_0 t^2).$$

This readily entails the following moment control, valid for all $\ell \ge 0$, $p \ge 1$:

$$\mathbb{E}\|B^\ell\|_{op}^p \lesssim (\ell \log \ell)^{p/2} \vee 1.$$
Proof. The proof is a slight modification of Meckes [24, Theorem 1], to which we refer
for a complete study. Lemma 6.1 is trivially satisfied if $\ell = 0, 1$, hence we will suppose
that $\ell \ge 2$. From Proposition 3.1, we derive that

$$\mathbb{E}\|B^\ell\|_{op} \le \mathbb{E}\|T((b_k)_k)\|_{op} = \mathbb{E}\sup_{x \in [0,1]} |Y_x|, \qquad Y_x = \sum_{k=0}^{\ell} b_k e^{2i\pi kx}.$$

We claim the two following facts:

• Let $a_0, \dots, a_\ell \in \mathbb{R}$. There exists $c > 0$ such that for all $t > 0$,

$$\mathbb{P}\Big(\Big|\sum_{k=0}^{\ell} a_k b_k\Big| > t\Big) \le \exp\Big(-\frac{c\,t^2}{\sum_{k=0}^{\ell} a_k^2}\Big) \qquad (27)$$

• $d(x, y) = \sqrt{\mathbb{E}|Y_x - Y_y|^2} \le 4\ell^{3/2}|x - y| \wedge 2\sqrt{\ell}$

The first point is readily verified since $(b_k)_{k \ge 0}$ is a standard Gaussian vector, while the
second point directly results from the bound

$$|e^{2i\pi kx} - e^{2i\pi ky}| \le 2 \wedge 2\pi k|x - y| \quad \text{for all } x, y \in [0, 1],\ k \ge 0.$$

A direct application of Dudley's entropy bound (Talagrand [27, Proposition 2.1]) now
entails

$$\mathbb{E}\sup_{x \in [0,1]} |Y_x| \lesssim (\ell \log \ell)^{1/2}$$

(see Meckes [24] for the rest of the proof). The deviation bound is now a consequence
of Talagrand [27, Lemma 5.3]. Indeed, for all $x \in [0, 1]$,

$$\mathbb{E}|Y_x|^2 = \mathbb{E}\Big|\sum_{k=0}^{\ell} b_k e^{2i\pi kx}\Big|^2 \lesssim \ell,$$

which ends the proof. □
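The order of magnitude stated in Lemma 6.1 can be explored by simulation. The sketch below assumes that $B^\ell$ is a lower triangular Toeplitz matrix built from $\ell + 1$ i.i.d. standard Gaussian coefficients (an assumption on the noise model made for this illustration) and reports the average operator norm divided by $\sqrt{(\ell \vee 1)\log(\ell \vee 2)}$; function names, seeds and sample sizes are arbitrary choices.

```python
import numpy as np

def random_lower_toeplitz(ell, rng):
    """(ell+1) x (ell+1) lower triangular Toeplitz matrix with i.i.d. N(0, 1) coefficients."""
    b = rng.standard_normal(ell + 1)
    idx = np.arange(ell + 1)
    diff = idx[:, None] - idx[None, :]            # row index minus column index
    return np.where(diff >= 0, b[np.clip(diff, 0, ell)], 0.0)

def mean_normalized_norm(ell, n_rep=50, seed=0):
    """Monte Carlo mean of ||B||_op / sqrt((ell v 1) log(ell v 2))."""
    rng = np.random.default_rng(seed)
    scale = np.sqrt(max(ell, 1) * np.log(max(ell, 2)))
    norms = [np.linalg.norm(random_lower_toeplitz(ell, rng), 2) for _ in range(n_rep)]
    return float(np.mean(norms) / scale)

for ell in (16, 64, 256):
    print(ell, mean_normalized_norm(ell))   # the ratios stay of order one
```

Such a simulation only illustrates the scaling; it says nothing about the constants $\beta_0$ and $c_0$ of the lemma.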

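Lemma 6.2. Let $\ell \ge 0$ and $a_\ell = \rho\,O_{\ell,\delta}^{-1}$ for some $0 < \rho < 1/2$. Denote by
$\gamma_\delta(z) = \sum_{k \ge 0} \gamma_{k,\delta} z^k$ the power series associated to $(K_\delta)^{-1}$. On
$A_\ell = \{\|(K_\delta^\ell)^{-1}\|_{op} \le O_{\ell,\delta}\}$ and $B_\ell = \{\|\delta B^\ell\|_{op} \le a_\ell\}$, the following inequalities hold

$$\|(K_\delta^\ell)^{-1}\|_{op} \le \frac{1-\rho}{1-2\rho}\,\|(K^\ell)^{-1}\|_{op} \quad \text{and} \quad \|(K^\ell)^{-1}\|_{op} \le (1-\rho)^{-1}\|(K_\delta^\ell)^{-1}\|_{op} \qquad (28)$$

$$\|(K_\delta^\ell)^{-1}\|_{HS} \le \frac{1-\rho}{1-2\rho}\,\|(K^\ell)^{-1}\|_{HS} \quad \text{and} \quad \|(K^\ell)^{-1}\|_{HS} \le (1-\rho)^{-1}\|(K_\delta^\ell)^{-1}\|_{HS} \qquad (29)$$

$$\sum_{k=0}^{\ell} \gamma_{k,\delta}^2 \le \Big(\frac{1-\rho}{1-2\rho}\Big)^2 \sum_{k=0}^{\ell} \gamma_k^2 \quad \text{and} \quad \sum_{k=0}^{\ell} \gamma_k^2 \le (1-\rho)^{-2} \sum_{k=0}^{\ell} \gamma_{k,\delta}^2 \qquad (30)$$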
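Proof. First, we have

$$(K^\ell)^{-1} = (K_\delta^\ell - \delta B^\ell)^{-1} = (I - \delta (K_\delta^\ell)^{-1} B^\ell)^{-1} (K_\delta^\ell)^{-1}.$$

On $A_\ell \cap B_\ell$, since $a_\ell$ satisfies $O_{\ell,\delta}\,a_\ell = \rho < 1/2$, by a usual Neumann series argument,
we have

$$\|(K^\ell)^{-1}\|_{op} = \Big\|\Big[\sum_{k \ge 0} (\delta (K_\delta^\ell)^{-1} B^\ell)^k\Big] (K_\delta^\ell)^{-1}\Big\|_{op} \le \Big[\sum_{k \ge 0} \|(K_\delta^\ell)^{-1}\|_{op}^k \|\delta B^\ell\|_{op}^k\Big] \|(K_\delta^\ell)^{-1}\|_{op} \le (1-\rho)^{-1} \|(K_\delta^\ell)^{-1}\|_{op} \qquad (31)$$

Secondly, we have

$$(K_\delta^\ell)^{-1} = (K^\ell + \delta B^\ell)^{-1} = (I + \delta (K^\ell)^{-1} B^\ell)^{-1} (K^\ell)^{-1}.$$

Moreover, thanks to (31), on $A_\ell \cap B_\ell$, we have

$$\|\delta (K^\ell)^{-1} B^\ell\|_{op} \le (1-\rho)^{-1}\rho < 1.$$

So that we can now similarly derive

$$\|(K_\delta^\ell)^{-1}\|_{op} \le \frac{1-\rho}{1-2\rho}\,\|(K^\ell)^{-1}\|_{op} \qquad (32)$$

This proves (28). The proofs of (29) and (30) follow the same lines, since $\|AB\|_{HS} \le \|A\|_{op}\|B\|_{HS}$ and $\|Ab\| \le \|A\|_{op}\|b\|$. □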
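The Neumann series argument behind this proof can be checked on a toy example. The sketch below uses a generic invertible matrix standing in for the noisy operator and a perturbation scaled so that the product of its operator norm with the norm of the inverse equals $\rho$; all matrices and names are hypothetical and only the generic bound $\|(M - E)^{-1}\|_{op} \le (1-\rho)^{-1}\|M^{-1}\|_{op}$ is illustrated.

```python
import numpy as np

rng = np.random.default_rng(1)
n, rho = 8, 0.4

# Hypothetical "noisy operator": lower triangular with a dominant diagonal.
K_delta = np.tril(rng.standard_normal((n, n))) + 3.0 * np.eye(n)
inv_norm = np.linalg.norm(np.linalg.inv(K_delta), 2)

# Perturbation rescaled so that ||E||_op * ||K_delta^{-1}||_op = rho < 1.
E = rng.standard_normal((n, n))
E *= (rho / inv_norm) / np.linalg.norm(E, 2)
K = K_delta - E                                  # stands for the unperturbed operator

lhs = np.linalg.norm(np.linalg.inv(K), 2)        # ||K^{-1}||_op
rhs = inv_norm / (1.0 - rho)                     # (1 - rho)^{-1} ||K_delta^{-1}||_op
print(lhs, rhs)
```

The same computation with the roles of the two matrices exchanged gives the companion inequality, at the price of a slightly larger constant.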



6.2.2. Proof of Theorem 3.3

Lemma 6.3. Under Assumption 3.2, we have, for all i 5* 0, 

^\{{K^\ A \ B X- SB e f + e$ L r) }( pt)\ q ] < (£ v ir(e v *)« (33) 
P(|<(^)- 1 1 A£ 1 B4 ( - 5B l f + Et Ll ),cp e )\ > < v <T (34) 

Proof. In order to prove Inequalities (33) and (34), it suffices to study the tails of the 
random variables ((K e s )~ 1 l Ae l Be ( — 5B e f e + , (fg) . For convenience we will only 
treat the case where £ ^ 2, otherwise the result follows by identical arguments. To this 
end, we study each term apart. On n Be, Lemma 6.2 and Assumption 3.2 entail 



1-p 
Thus, combining Assumption 2.3 with the latter inequality, a brief conditioning
argument readily yields



P 



{\((K e 5 )- 1 l Ae l Be e^ Ll ,i Pe )\ > tj < exp ( - JL;) 



Let us study the second term. On $A_\ell \cap B_\ell$, we have



5(K«)- l B* = J] (SiK'yWy = 5(K e )~ 1 B e + £ ^(J^) -1 ^) 

Hence, 



5(^)- 1 Ul B ^Y = r 1 + r 2 (35) 



where 



i/ 

r 2 = ((6{K e )-iB e )\l + 6{K')-iB e )- 1 f,<pd 
Let us now bound $r_1$ and $r_2$ separately. We first apply equality (16) to get

where (b% = (b e ) e - k - The result is a centred gaussian variable with variance 

\\5(K")- l ff < 5 2 Q 2 M 2 ^ 

which hence satisfies 

P(|r 1 |>t)<exp(^ ; ) 

Let us study the term r 2 . Since the maximal level L verifies L < X(5\ log <5| ) — T7 + T , we 
have, for all i < L, <5f +1 log^ < 1. We deduce that 



(36) 



|r 2 | >t) ^F(6 2 £ 2v \\B e \\ 2 op >t) 
1 

'ioi7 ll±r| 



i 



sP(^IIB'li>rV-" () 

< exp(— t(S£ v ) )^-{t>j3 si"} + l(t«ft«" } 



Inequality (34) directly follows, and inequality (33) is now a direct application of the
well-known formula

$$\mathbb{E}[X^2] = \int_{t \ge 0} 2t\,\mathbb{P}(|X| > t)\,dt.$$



□ 
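The tail-integration formula invoked at the end of this proof can be sanity-checked on a standard Gaussian variable, for which the second moment equals 1 and $\mathbb{P}(|X| > t) = \mathrm{erfc}(t/\sqrt{2})$. The grid, its truncation at $t = 12$ and the trapezoidal rule are choices made for this sketch only.

```python
import math
import numpy as np

t = np.linspace(0.0, 12.0, 120001)
# P(|X| > t) for X ~ N(0, 1)
tail = np.array([math.erfc(x / math.sqrt(2.0)) for x in t])
integrand = 2.0 * t * tail
# Trapezoidal approximation of the integral of 2 t P(|X| > t) over [0, 12]
second_moment = float(np.sum(0.5 * (integrand[:-1] + integrand[1:]) * np.diff(t)))
print(second_moment)   # close to E[X^2] = 1
```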
Proof of Theorem 3.3. We apply Parseval's formula to derive



em 1 e>L T 



The second term is easily handled. Remark first that, since s > 1/2, we have 
e can write 



V+l > and we can write 



f ^ fri\- 2s 
e>L I 

^ (e^J\\oge\) 1 ^ I v (c5| log 5|) ^ 



^ (eVl lo g e l) 2(s+ " ,+1 v (5|log<5|) 2(s+ " ,+1 

In order to lighten the notation, we will only consider the indices $\ell \ge 2$ in the first
term. This is of course not problematic, since an identical reasoning allows us to bound
the two remaining summands by the desired rates of convergence. We hence write the
following decomposition

<I + II + III + IV 



where 



7/ = E^llC^sLlW* 
JJ/ = 2 E(C, - /,) 2 l Ai l B? l { } + 2 E/, 2 1 { 

• Term I and II. On A^, we have 

- = <(^)- 1 (- SB e f + ^> (37) 
Hence we can decompose further I as 



+ 2 E<(^)- 1 l A£ l Bi (-^ + ^),^> 2 l {|C£|>sK} l {| ^ |<5L/2} 
= V + VI 
Let us first treat the term VI. From Lemmas 6.2 and 6.3 and the Cauchy-Schwarz inequality,
we derive



VI < J] B[({Ki)- 1 l At l Bt (6B t f t + et),<p e y 



4 l ' 2 



■n\Ce-h\>sD i/2 

< J] [(iv£)¥f /2 (^V^ 2 ) 

which is less than the desired bound for $\tau$ large enough. As for term V, we split it in
two and write

V< J] n&5)- 1 lA t l Bt (6B'f + et),ri\ )tl >sj M 

S^(/, 2 (r5|io g 5|)- 2 Ai) 

2^(/, 2 (r £V 1^)- 2 Ai) 



^L 1 



V 

em 1 



Note 4 = (<5| log<5|) 2(s+ " )+1 and write 



2 ^(/J(^|logJ|)- a a 1) < 2 ^ + J] j^log^ 

^L 1 



< 



(^l log ^i) 



—z 

The e-term is treated similarly by taking £ £ = {s^J\ loge|) 2s+2 " +1 and leads to the desired 
convergence rate. As for the term //, a similar reasonning leads to 

//< 2 ^{K^sl} 1 ^^} + 2 E ^ 1 {lC < l<sJ,.} 1 {l/ < l>^ >e } 
=Vi7 + WJI 



The term VIII is handled as the term VI. Indeed, 

VIII < J] fX\C e U > Si) < 2 /,V v <T) 
which is less than the desired rate for $\tau$ large enough. Finally, we have



< J] ^. 2 | loge| + J] f\ + J] «| logtf| + J] £ 
< (e^/\h^e\) J ^ TT v (S\ log^ 2 )^^ 1 



Term III. We have 

/II < ^E((( < -/,)\ + / f 2 )l B? 

< 2 e[(o-/,) 4 u] 1/2 p(^) 1/2 + 2 



□ 



^l 1 e^L 1 
Moreover, Lemma 6.1 entails 

for all $\ell \ge 1$. It is hence clear that for $\kappa$ large enough, the term III is less than the
announced rate. □

• Term IV. We claim that 

1 {A1) ^ 1 {||(K0- 1 ||op>O M 1 /2} + 1 {||,5B*|| op >0 £ , 4 } 

for all t ^ (see Delattre et al. [10], Lemma 5.3). Hence, 

IV < J] E/J(l {||(K£) _ 1||op ^ 1/2} + l{ i5B ,|| op ^^) 

= Wi7 + /X 

Since KKO-^lop < Qi£\ we have {K^)- 1 ^ > 0^/2} c {r+V^yl^I ^ 
c(5\ log <5 1) X } where c is a constant depending only on Q2 and ft. Hence 



VIII < 2 / 

2 

^c(<5|log<5| 3 / 2 ) 2l/+1 

< (<5| log5|) 
As for IX, a quick application of Lemma 6.1 entails

P(||5B*|| op > O e , 5 ) < ^ 2 



2 



so that



IX < J] /,V < 5 k2 



/ 2 

which is less than the announced rate for $\kappa$ large enough. □



It remains to put together the bounds of the four terms above to get the desired
rates of convergence in Theorem 3.3. □

6.2.3. Proof of Theorem 4.5

Lemma 6.4. Denote, for $\ell \ge 0$, $S_\ell^{II} = (\ell \vee 1)^{\nu - 1/2}\,(\tau_{sig}\,\varepsilon\sqrt{|\log \varepsilon|} \vee \tau_{op}\,\delta|\log \delta|)$. Under
Assumption 4.1, we have, for all $\ell \ge 0$, for all $q \ge 0$,

EfK^^UlB^-^V + ^>| 9 ] £ v 1)^-^(5 v 5)" (38) 

P(\<{K t s )-H At l Bt (-5B'f i + e^), V i>\ > S%) < v <T (39) 

Proof. The proof is very similar to that of Lemma 6.3, so we only mention the notable
changes. Once more, we shall only treat the case $\ell \ge 2$. First, we have

{{K" 5 )- l l Al l Bl st vi> = (s^n* {KIYH^Ib^) = e(Z L u, ((1/gsY)') 
so that a brief conditioning argument, combined with (30) and Assumption 2.3, entails

P(|<(^)- 1 l^l B ,^ L n, <^>| >t)< exp( 



£ 2£2u-l 



In order to treat the term P(|((JiC 5 ) 1 1^1 B( ,5-B^ / , ip e )\ > t), we first establish a useful 
result for the sequel: if g satisfies (20), then 



KK^rw = \\T e (f)(i/ g y\\ < \\T e (f)U(i/ 9 y\\ 

Furthermore, thanks to Proposition 3.1 we have 

iiw)iio P < 2 \h\< E*WI! r3b * 1 

e^o e^o i^o 

since $f \in W^s(M)$ and $s > 1/2$. We derive that

$$\|(K^\ell)^{-1} f\|^2 \lesssim \ell^{2\nu - 1} \qquad (40)$$

Let us now bound the term of interest. Once more, we decompose it as $r_1 + r_2$ where
$r_1$ and $r_2$ are defined in (36). We now apply Proposition 16 and (32) and derive

P(|n| >t)= F(\(5(K e )- l B e l A l Be f,cp e )\ > t) 
^F(\(5(K e )- 1 f, t B^ e )\>t) 



The latter is a Gaussian random variable with variance $\delta^2\|(K^\ell)^{-1} f\|^2 \lesssim \delta^2 \ell^{2\nu - 1}$, where
we used (40). Turning to the term $r_2$, we apply Proposition 16 to derive

P(M >t)= ¥{\<fi{B^\Ki)-H At l Bi f*(K i )- 1 V ^\ > t) 
We now apply (30) and (40) to get 
Hence, 



P(|r 2 | >*) < ¥(5 2 £ 



- X \\B l \l^ Bi >t) 



^ P (^ll^llo P U 1 B,>i(^log^)- 1 ) 
Let us look back at Lemma 6.2. On $A_\ell \cap B_\ell$ we have proved that

\\{K" s )- 1 \\ m ^{l-p)Q l t 
so that A £ n B £ c {5r +1 / 2 log^ < 1 J. We deduce 



n\r2\>t)<F(jl-\\BX P >t(St 



< 



2 ^+(xr-V2-\-^ 

GX P V §£u- 1/2 ) 1 { t >fil St"- 1/2 } + 1 {^/3 2 5^-i/2| 



The end of the proof is identical to Lemma 6.3. 



□ 



Proof of Theorem 4.5. The proof is very similar to that of Theorem 3.3, so we only
emphasize the notable changes. First, we apply Parseval's formula to
derive

E i/ n - f\\i = s E<? n -/,^> 2 + 2 K 

The second term is easily handled, since 

„2 . , T TTx_2s 



2 ft < (L U ] 



e>L u 

,2s 



^ (ey/\loge\)& v (<y|log«5|)' 



To bound the first sum, we write the following decomposition 

< I + II + III + IV 



'= S E (0-« 2 ^lH ( l {|((|>sn 
where

Hd-h) 2 - 

rtr i 2 -i 

111= ^ Efa-tfl^l^ } + J] E#l { 
Thanks to Lemma 6.2 and the definition of S}\, we have 

i - p > e e > £ i - p e ' e 

on $A_\ell \cap B_\ell$. Thus, Terms I and II can be treated as in the preceding proof
and yield the desired rates of convergence. Terms III and IV are treated exactly as
in the preceding proof. □

6.3. Proof of Theorem 4.6

Proof. The lower bound will not decrease for increasing noise levels $\delta$ and $\varepsilon$, whence it
suffices to treat the case $\delta = 0$ and the case $\varepsilon = 0$ separately. In the sequel, $c_i$ will
denote a positive constant to be adjusted later and $L$ will play the role of a maximal
level. Also, we will denote by $K_\nu$ (resp. $g_\nu$) the operator (resp. the function) associated with
the Laurent series $(1-z)^{\nu}$. The function $g_{-1/2}$ will play an essential role in the sequel.
Unfortunately it is not square integrable. We thus begin with a preliminary lemma,
which states that a minor modification corrects this defect.

Lemma 6.5. Let $h$ be the function associated to the Laurent series

$$\sum_{\ell \ge 0} \frac{(-1)^\ell}{\log(\ell \vee 2)} \binom{-1/2}{\ell} z^\ell.$$

Then $h$ is square integrable. Furthermore, for all $\nu \ge 0$, for all $\ell \le L$,

Proof of Lemma 6.5. $h$ is trivially square integrable thanks to (26) and Parseval's
formula. Now, we have

$$\langle K_{-\nu} h, \varphi_\ell \rangle = (-1)^\ell \sum_{k=0}^{\ell} \binom{-\nu}{k} \binom{-1/2}{\ell-k} \log^{-1}((\ell - k) \vee 2).$$

Moreover, since the product $\binom{-\nu}{k}\binom{-1/2}{\ell-k}$ has a constant sign for all $k \le \ell$, we derive

but $(-1)^\ell \sum_{k=0}^{\ell} \binom{-\nu}{k}\binom{-1/2}{\ell-k}$ is precisely the $\ell$-th coefficient of the power series
$(1-z)^{-\nu}(1-z)^{-1/2} = (1-z)^{-\nu-1/2}$ which satisfies, thanks to (26),

$$(-1)^\ell \sum_{k=0}^{\ell} \binom{-\nu}{k}\binom{-1/2}{\ell-k} \sim \frac{\ell^{\nu - 1/2}}{\Gamma(\nu + 1/2)}.$$

This entails the result. □
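The convolution identity at the heart of this proof, namely that the coefficients of $(1-z)^{-\nu}$ convolved with those of $(1-z)^{-1/2}$ give the coefficients of $(1-z)^{-\nu-1/2}$, can be verified numerically. The sketch below works with the nonnegative coefficients $c_k(a) = \Gamma(k+a)/(\Gamma(a)\,k!)$ of $(1-z)^{-a}$, computed by recursion; the function name and the values of $\nu$ and of the truncation level are illustrative choices.

```python
import numpy as np

def series_coeffs(a, n):
    """First n+1 coefficients of (1 - z)^(-a): c_0 = 1, c_k = c_{k-1} * (k - 1 + a) / k."""
    c = np.empty(n + 1)
    c[0] = 1.0
    for k in range(1, n + 1):
        c[k] = c[k - 1] * (k - 1 + a) / k
    return c

nu, n = 1.0, 200
# Cauchy product of the two series, truncated at order n
lhs = np.convolve(series_coeffs(nu, n), series_coeffs(0.5, n))[: n + 1]
rhs = series_coeffs(nu + 0.5, n)
print(np.max(np.abs(lhs - rhs)))   # numerically zero
```

The logarithmic weights entering the definition of $h$ are not included here; only the unweighted identity is checked.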

• Case $\delta = 0$. For more clarity, we will suppose that $\xi$ is a white noise (the proof readily
adapts otherwise). Let hence $K^0 = c_1 K_\nu$. Then $K^0 \in \mathcal{G}_\nu(Q)$ for an appropriate
constant $c_1$, thanks to Proposition 4.2. Following the arguments of Willer [28], it suffices
to find $f_0$, $f_1$ such that

i) / ,/ ie >V s (M) 2 

ii) ll/o - /ill 2 £ £^|loge|- 2 

iii) K(F 1 ,F 2 ) 5s 1 where is the law of y under the hypothesis f i7 and K is the 

Kullback-Leibler divergence. 

-i 

Let L = C2E a +» . Set fo = and define f 1 = c^K_ u h. 

Point i): / trivially belongs to the considered set. Moreover, Lemma 6.5 entails 

Point ii): again, thanks to Lemma 6.5, we have 

ll/o - /if * J] f// > e 2 L-(logL)- 2 > e &| log^" 2 

Point iii): the expression of the Kullback-Leibler divergence in this case is 

thanks to Lemma 6.5. The choice of appropriate constants c, clearly yields the result 
and the proof is complete. □ 

• Case £ = 0. Let L = cid 7 ^ . Following the lines of Hoffmann and Reifi [17], we set 
f = K° = czK u and we only consider couples (K, f) such that Kf = q for a 
fixed q = K°f . It is clear that, for well chosen C2 and C3, we have / e W S (M) and 
K e G U (Q). We thus define H the operator associated to the kernel h and introduce 

£ £ 

K = K u + c 4 5H a perturbation of K u . We shall refer to g for the corresponding 
kernel. Remark that we have 

A - /o = c,5{K 6 )- l Hf Q = c^K^h (41) 
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Furthermore, for C4 small enough, we have thanks to Lemma 6.5 and Proposition 4.2, 

K 5 - K°\ op < 5L l/2 < S^^y < - 

II Hop ~ 2 

since $s > 1/2$. Hence, the same Neumann series arguments as in Lemma 6.2 entail that
$K^\delta$ belongs to $\mathcal{G}_\nu(Q)$. We now need to check that i), ii) and iii) are satisfied, replacing
$\varepsilon$ with $\delta$.

Point i) : (41) and the preceding remark entail 

ll/l - /oik = ^ 2 ^({^h, ^) * £ ^ +2 ^ 1 * 1 

Point ii) : we precise (41) and write 

f 1 f = c A 5K- l h + cl5 2 (K 5 )- l K- l Hh 

Moreover, Lemma 6.5 and the preceding remark entail 

jSK^Hfof = \\5K- l hf >5 2 Y, fU \ > 5 2 L^(logL)- 2 > «^|logC 2 
^(K^K^&fvW 2 < 5 4 |(lf*)- 1 ||Hs||-K'~ 1 -H r ||Hs|'i|| ^ <5 4 L 4i/+1 < 5^S 2 L 2u 



Since s > 1/2, the second term is negligible with respect to the first. This proves 
the point ii). 

Point iii) : Since we work with couples (K, f) such that Kf is fixed, we have 

*(Po,Pi) = ^V-flJ J = yN^l 

thanks to Lemma 6.5 and the proof is complete. □ 

It remains to piece together the two cases $\delta = 0$ and $\varepsilon = 0$ to get the desired
result. □